Design and Experience: Using the Intel® Itanium® 2 Processor Performance Monitoring Unit to Implement Feedback Optimizations

Authors

  • Youngsoo Choi
  • Allan Knies
  • Geetha Vedaraman
  • Jeremiah Williamson

Abstract

Historically, profile-guided optimization has gathered its profile data by executing an instrumented binary and capturing the output. While this approach enables the collection of function and basic block frequencies, it cannot extract microarchitectural event information such as cache activity, TLB activity, and branch prediction behavior. Using instrumentation also requires that programs be compiled with different options (one set for the profile run, one for the optimization run), and the profiling run takes substantially longer because of instrumentation overhead and reduced compiler optimization. To help address these issues, the Intel® Itanium® 2 processor has extensive hardware support that allows highly accurate instruction-specific information to be gathered from any binary. In this paper, we cover three broad topics: the Itanium® 2 processor performance monitoring unit (PMU); our tools and methodology for gathering and processing cache, TLB, and branch activity information; and a case study in which we use the entire system to reduce data access stalls.

Similar resources

The Itanium 2 Processor Extends the Processing Power of the Itanium Processor Family with a Capable and Balanced Microarchitecture. Executing up to Six Instructions at a Time, It Provides Both Performance and Binary Compatibility

On 8 July 2002, Intel introduced the Itanium 2 processor—the Itanium architecture’s second implementation. This event was a milestone in the cooperation between Intel and Hewlett-Packard to establish the Itanium architecture as a key workstation, server, and supercomputer building block. The Itanium 2 processor may appear similar to the Itanium processor, yet it represents significant advances ...

A 32nm 3.1 billion transistor 12-wide-issue Itanium® processor for mission-critical servers

Poulson High Level Summary (July 2011, Revision 1.1). The next generation in the Intel® Itanium® processor family, code named Poulson, has eight multi-threaded 64-bit cores. Poulson is socket-compatible with the current Intel® Itanium® Processor 9300 series (Tukwila) [1]. The new design integrates a ring-based system interface derived from portions of previous Xeon® and Itanium® processors, and inc...

Practical Compiler Techniques on Efficient Multithreaded Code Generation for OpenMP Programs

State-of-the-art multiprocessor systems pose several difficulties: (i) the user has to parallelize the existing serial code; (ii) explicitly threaded programs using a thread library are not portable; (iii) writing efficient multi-threaded programs requires intimate knowledge of the machine’s architecture and micro-architecture. Thus, well-tuned parallelizing compilers are in high demand to leverage...

Compilation for the Itanium Processor

This paper describes a just-in-time (JIT) Java compiler for the Intel Itanium processor. The Itanium processor is an example of an Explicitly Parallel Instruction Computing (EPIC) architecture and thus relies on aggressive and expensive compiler optimizations for performance. Static compilers for Itanium use aggressive global scheduling algorithms to extract instruction-level parallelism. In a...

On the Predictability of Program Behavior Using Different Input Data Sets

Smaller input data sets such as the test and the train input sets are commonly used in simulation to estimate the impact of architecture/micro-architecture features on the performance of SPEC benchmarks. They are also used for profile feedback compiler optimizations. In this paper, we examine the reliability of reduced input sets for performance simulation and profile feedback optimizations. We...

Publication date: 2002